Decision Key-Value Feature Construction for Multihoming Big Data Network
Authors
Abstract
The random forest algorithm under the MapReduce framework suffers from too many redundant and irrelevant features, low information content in the training features, and low parallelization efficiency when dealing with multihoming big data network problems, so a parallel random forest algorithm based on information theory and norms (PRFITN) is proposed. In this paper, the technique first builds a hybrid dimensionality reduction approach (DRIGFN) based on information gain and the Frobenius norm, successfully reducing the number of redundant and irrelevant features; this results in a dimensionality-reduced dataset. Then, a feature grouping strategy based on information theory (FGSIT) is offered: the features are grouped using the FGSIT strategy, and stratified sampling is employed to assure the information quantity of the training features when building the decision trees of the random forest. When datasets are provided as key/value pairs, it is common to want to aggregate statistics across all objects with the same key. Finally, in the Reduce stage, a key-value pair redistribution method (RSKP) is used to acquire the global classification results and achieve a rapid and equal distribution of key-value pairs, which improves the cluster’s parallel efficiency. According to the experimental findings, the method provides a superior classification impact on large networks, particularly those with numerous characteristics. Feature selection and feature extraction can be used together. In addition to minimizing overfitting and redundancy, lowering dimensionality contributes to improved human interpretation and cheaper computing costs through model simplicity.
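The abstract's remark about aggregating statistics across all objects with the same key, and about distributing key-value pairs evenly, can be illustrated with a minimal sketch in plain Python. This is not the paper's RSKP method, whose details are not given here; the function names `aggregate_by_key` and `assign_to_reducers` and the greedy balancing rule are assumptions chosen only for illustration.

```python
from collections import defaultdict


def aggregate_by_key(pairs):
    """Compute count, sum, and mean of the values seen for each key."""
    totals = defaultdict(lambda: [0, 0.0])  # key -> [count, running sum]
    for key, value in pairs:
        totals[key][0] += 1
        totals[key][1] += value
    return {
        key: {"count": count, "sum": total, "mean": total / count}
        for key, (count, total) in totals.items()
    }


def assign_to_reducers(key_counts, n_reducers):
    """Hypothetical greedy balancing (an assumption, not RSKP):
    hand the heaviest remaining key to the currently least-loaded reducer."""
    loads = [0] * n_reducers
    assignment = {}
    # Sort by descending record count; break ties by key for determinism.
    ordered = sorted(key_counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
    for key, count in ordered:
        target = loads.index(min(loads))  # first reducer with the smallest load
        assignment[key] = target
        loads[target] += count
    return assignment, loads


if __name__ == "__main__":
    pairs = [("a", 1.0), ("b", 2.0), ("a", 3.0)]
    print(aggregate_by_key(pairs))
    print(assign_to_reducers({"a": 5, "b": 3, "c": 2}, n_reducers=2))
```

In a real MapReduce or Spark job the per-key aggregation would typically be done by the framework's own combine and reduce steps; the sketch only shows the shape of the computation on a single machine.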
Similar resources
Cover Feature Big Data
Today’s applications often contain datasets that are too big to fit in a single computer’s main memory. Analyzing these massive datasets will require scalable and sophisticated machine-learning methods. Two commonly used approaches are stochastic optimization and inference algorithms [1], which process one data point at a time; and distributed computing based on the MapReduce framework [2], where the...
Application of Big Data Analytics in Power Distribution Network
Smart grid enhances optimization in generation, distribution and consumption of the electricity by integrating information and communication technologies into the grid. Today, utilities are moving towards smart grid applications, most common one being deployment of smart meters in advanced metering infrastructure, and the first technical challenge they face is the huge volume of data generated ...
Massively-Parallel Feature Selection for Big Data
We present the Parallel, Forward-Backward with Pruning (PFBP) algorithm for feature selection (FS) in Big Data settings (high dimensionality and/or sample size). To tackle the challenges of Big Data FS, PFBP partitions the data matrix both in terms of rows (samples, training examples) as well as columns (features). By employing the concepts of p-values of conditional independence tests and meta-...
Key Technologies for Big Data Stream Computing
As a new trend for data-intensive computing, real-time stream computing is gaining significant attention in the Big Data era. In theory, stream computing is an effective way to support Big Data by providing extremely low-latency processing tools and massively parallel processing architectures in real-time data analysis. However, in most existing stream computing environments, how to efficiently...
Feature Selection in Structural Health Monitoring Big Data Using a Meta-Heuristic Optimization Algorithm
This paper focuses on the processing of structural health monitoring (SHM) big data. Extracted features of a structure are reduced using an optimization algorithm to find a minimal subset of salient features by removing noisy, irrelevant and redundant data. The PSO-Harmony algorithm is introduced for feature selection to enhance the capability of the proposed method for processing the measure...
Journal
Journal title: Wireless Communications and Mobile Computing
Year: 2023
ISSN: 1530-8669, 1530-8677
DOI: https://doi.org/10.1155/2023/2977126